Introduction to Areal Unit Data

Author

Serge Rey

Published

February 21, 2023

Areal Unit Data

Code
import geopandas
import libpysal
Code
south = libpysal.examples.load_example('South')
Code
libpysal.examples.explain('South')

Loading

Code
south_gdf = geopandas.read_file(south.get_path('south.shp'))

Plotting Geometries

Code
south_gdf.plot()

Checking CRS

Code
south_gdf.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

Turning off axis

Code
ax = south_gdf.plot()
ax.set_axis_off();

Inspecting the GeoDataFrame

Code
south_gdf.shape
(1412, 70)
Code
south_gdf.geometry
0       POLYGON ((-80.62805 40.39816, -80.60204 40.480...
1       POLYGON ((-80.52625 40.16245, -80.5876 40.1750...
2       POLYGON ((-80.52517 40.02275, -80.73843 40.035...
3       POLYGON ((-80.52447 39.72113, -80.83248 39.718...
4       POLYGON ((-75.7727 39.38301, -75.79144 39.7237...
                              ...                        
1407    POLYGON ((-79.14433 36.54606, -79.21706 36.549...
1408    POLYGON ((-79.43775 37.61596, -79.45834 37.603...
1409    POLYGON ((-80.12475 37.1251, -80.14045 37.1283...
1410    POLYGON ((-76.39569 37.10771, -76.4027 37.0905...
1411    POLYGON ((-77.53178 38.56506, -77.72094 38.840...
Name: geometry, Length: 1412, dtype: geometry
Code
south_gdf.columns
Index(['NAME', 'STATE_NAME', 'STATE_FIPS', 'CNTY_FIPS', 'FIPS', 'STFIPS',
       'COFIPS', 'FIPSNO', 'SOUTH', 'HR60', 'HR70', 'HR80', 'HR90', 'HC60',
       'HC70', 'HC80', 'HC90', 'PO60', 'PO70', 'PO80', 'PO90', 'RD60', 'RD70',
       'RD80', 'RD90', 'PS60', 'PS70', 'PS80', 'PS90', 'UE60', 'UE70', 'UE80',
       'UE90', 'DV60', 'DV70', 'DV80', 'DV90', 'MA60', 'MA70', 'MA80', 'MA90',
       'POL60', 'POL70', 'POL80', 'POL90', 'DNL60', 'DNL70', 'DNL80', 'DNL90',
       'MFIL59', 'MFIL69', 'MFIL79', 'MFIL89', 'FP59', 'FP69', 'FP79', 'FP89',
       'BLK60', 'BLK70', 'BLK80', 'BLK90', 'GI59', 'GI69', 'GI79', 'GI89',
       'FH60', 'FH70', 'FH80', 'FH90', 'geometry'],
      dtype='object')
Code
south_gdf.explore(column='HR60')
Code
south_gdf.HR60.describe()
count    1412.000000
mean        7.292144
std         6.421018
min         0.000000
25%         3.213471
50%         6.245125
75%         9.956272
max        92.936803
Name: HR60, dtype: float64
Code
ax = south_gdf.plot(column='HR60')
ax.set_axis_off();

How many states are there in this dataset?

Code
south_gdf.STATE_NAME.unique().shape
(17,)

How many counties?

Code
south_gdf.shape[0]
1412

How many counties in each state?

Code
south_gdf.groupby(by='STATE_NAME').count()
NAME STATE_FIPS CNTY_FIPS FIPS STFIPS COFIPS FIPSNO SOUTH HR60 HR70 ... BLK90 GI59 GI69 GI79 GI89 FH60 FH70 FH80 FH90 geometry
STATE_NAME
Alabama 67 67 67 67 67 67 67 67 67 67 ... 67 67 67 67 67 67 67 67 67 67
Arkansas 75 75 75 75 75 75 75 75 75 75 ... 75 75 75 75 75 75 75 75 75 75
Delaware 3 3 3 3 3 3 3 3 3 3 ... 3 3 3 3 3 3 3 3 3 3
District of Columbia 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1 1
Florida 67 67 67 67 67 67 67 67 67 67 ... 67 67 67 67 67 67 67 67 67 67
Georgia 159 159 159 159 159 159 159 159 159 159 ... 159 159 159 159 159 159 159 159 159 159
Kentucky 120 120 120 120 120 120 120 120 120 120 ... 120 120 120 120 120 120 120 120 120 120
Louisiana 64 64 64 64 64 64 64 64 64 64 ... 64 64 64 64 64 64 64 64 64 64
Maryland 24 24 24 24 24 24 24 24 24 24 ... 24 24 24 24 24 24 24 24 24 24
Mississippi 82 82 82 82 82 82 82 82 82 82 ... 82 82 82 82 82 82 82 82 82 82
North Carolina 100 100 100 100 100 100 100 100 100 100 ... 100 100 100 100 100 100 100 100 100 100
Oklahoma 77 77 77 77 77 77 77 77 77 77 ... 77 77 77 77 77 77 77 77 77 77
South Carolina 46 46 46 46 46 46 46 46 46 46 ... 46 46 46 46 46 46 46 46 46 46
Tennessee 95 95 95 95 95 95 95 95 95 95 ... 95 95 95 95 95 95 95 95 95 95
Texas 254 254 254 254 254 254 254 254 254 254 ... 254 254 254 254 254 254 254 254 254 254
Virginia 123 123 123 123 123 123 123 123 123 123 ... 123 123 123 123 123 123 123 123 123 123
West Virginia 55 55 55 55 55 55 55 55 55 55 ... 55 55 55 55 55 55 55 55 55 55

17 rows × 69 columns
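
Because `count()` reports the number of non-missing values for every column, the table above repeats the same figure across all 69 columns. A more compact alternative is `groupby(...).size()`, which returns a single count per group. The sketch below uses a small made-up frame named `toy`; its county names and rows are hypothetical and are not taken from the South dataset.

```python
import pandas as pd

# Hypothetical stand-in for south_gdf: one row per county.
toy = pd.DataFrame({
    "NAME": ["A", "B", "C", "D", "E", "F"],
    "STATE_NAME": ["Texas", "Texas", "Texas", "Delaware", "Delaware", "Florida"],
})

# size() counts rows per group and returns one Series,
# instead of one identical column per variable as count() does.
counties_per_state = toy.groupby("STATE_NAME").size().sort_values(ascending=False)
print(counties_per_state)
```

On the real data the equivalent call would be `south_gdf.groupby('STATE_NAME').size()`.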

Which state had the highest median county homicide rate in 1960?

Code
south_gdf[['STATE_NAME', 'HR60']].groupby(by='STATE_NAME').median()
HR60
STATE_NAME
Alabama 9.623977
Arkansas 4.704111
Delaware 4.228385
District of Columbia 10.471807
Florida 9.970306
Georgia 9.300076
Kentucky 5.235436
Louisiana 6.840286
Maryland 5.335208
Mississippi 8.919274
North Carolina 7.633043
Oklahoma 4.269126
South Carolina 7.509437
Tennessee 4.877751
Texas 4.326215
Virginia 6.672004
West Virginia 2.623226
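
The table of medians still has to be scanned by eye to answer the question. One way to pull out the answer directly is `idxmax()`, which returns the index label of the largest value. The sketch below runs on a made-up frame (`toy` and its values are hypothetical) in which one state has an extreme county but a modest median.

```python
import pandas as pd

# Hypothetical county-level rates; not values from the South dataset.
toy = pd.DataFrame({
    "STATE_NAME": ["Texas", "Texas", "Texas", "Delaware", "Delaware", "Florida"],
    "HR60": [2.0, 4.0, 90.0, 3.0, 5.0, 9.0],
})

# Median county rate within each state.
state_median = toy.groupby("STATE_NAME")["HR60"].median()

# Label and value of the state with the largest median.
top_state = state_median.idxmax()
top_value = state_median.max()
print(top_state, top_value)
```

In the output above, the largest median is the District of Columbia's 10.471807, which rests on a single observation.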

Which state had the county with the highest homicide rate in 1960?

Code
south_gdf[['STATE_NAME', 'HR60']].groupby(by='STATE_NAME').max()
HR60
STATE_NAME
Alabama 24.903499
Arkansas 21.154427
Delaware 7.286472
District of Columbia 10.471807
Florida 40.744262
Georgia 53.304904
Kentucky 37.250885
Louisiana 18.243736
Maryland 14.327234
Mississippi 24.833923
North Carolina 25.660127
Oklahoma 17.088175
South Carolina 23.345940
Tennessee 20.894275
Texas 92.936803
Virginia 23.575639
West Virginia 11.482375
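
The grouped maximum shows the largest rate in each state but not which county it belongs to. A minimal sketch of recovering the county itself, assuming the row index is unique, combines `idxmax()` with `.loc`. The frame `toy` and its county names are hypothetical.

```python
import pandas as pd

# Hypothetical county-level rates; not values from the South dataset.
toy = pd.DataFrame({
    "NAME": ["Alpha", "Beta", "Gamma", "Delta"],
    "STATE_NAME": ["Texas", "Texas", "Delaware", "Florida"],
    "HR60": [2.0, 90.0, 5.0, 9.0],
})

# Row of the single county with the largest rate overall.
top_row = toy.loc[toy["HR60"].idxmax(), ["NAME", "STATE_NAME", "HR60"]]

# Row label of the top county within each state, then the matching rows.
top_labels = toy.groupby("STATE_NAME")["HR60"].idxmax()
top_by_state = toy.loc[top_labels, ["STATE_NAME", "NAME", "HR60"]]
print(top_row)
print(top_by_state)
```

Applied to the real data, the same pattern would be `south_gdf.loc[south_gdf.HR60.idxmax(), ['NAME', 'STATE_NAME', 'HR60']]`.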

Intra-state dispersion

Code
south_gdf[['STATE_NAME', 'HR60']].groupby(by='STATE_NAME').std()
HR60
STATE_NAME
Alabama 4.742337
Arkansas 4.574625
Delaware 1.815562
District of Columbia NaN
Florida 7.990692
Georgia 7.906488
Kentucky 6.354316
Louisiana 4.189146
Maryland 4.064360
Mississippi 4.972698
North Carolina 4.596952
Oklahoma 4.231132
South Carolina 4.018644
Tennessee 4.354979
Texas 8.223844
Virginia 4.826707
West Virginia 2.773659
Code
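# Standard deviation of HR60 across the counties of each state
sgdf = south_gdf[['STATE_NAME', 'HR60']].groupby(by='STATE_NAME').std()
Code
# Coefficient of variation (%): standard deviation relative to the state mean
cv = sgdf / south_gdf[['STATE_NAME', 'HR60']].groupby(by='STATE_NAME').mean() * 100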
Code
cv.sort_values(by='HR60', ascending=False)
HR60
STATE_NAME
Texas 144.992919
Kentucky 96.815524
West Virginia 93.234007
Arkansas 81.223752
Oklahoma 81.114430
Tennessee 75.426226
Georgia 73.774440
Maryland 71.898559
Florida 68.252692
Virginia 66.924041
Louisiana 59.994571
Mississippi 57.457024
North Carolina 57.013871
Alabama 49.070812
South Carolina 48.083524
Delaware 34.966796
District of Columbia NaN
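
The coefficient of variation can also be computed in one pass with `agg`, which keeps the mean and standard deviation side by side. The sketch below uses a made-up frame (`toy` and its values are hypothetical). It also shows why the District of Columbia comes out as NaN: pandas computes the sample standard deviation (`ddof=1`) by default, which is undefined for a group with a single observation.

```python
import pandas as pd

# Hypothetical county-level rates; not values from the South dataset.
toy = pd.DataFrame({
    "STATE_NAME": ["Texas", "Texas", "Texas",
                   "Delaware", "Delaware",
                   "District of Columbia"],
    "HR60": [2.0, 4.0, 6.0, 3.0, 5.0, 10.0],
})

# Mean and sample standard deviation per state in a single groupby.
stats = toy.groupby("STATE_NAME")["HR60"].agg(["mean", "std"])

# Coefficient of variation as a percentage of the state mean.
stats["cv"] = stats["std"] / stats["mean"] * 100
print(stats.sort_values(by="cv", ascending=False))
```

A single-county group has no within-group spread to estimate, so leaving its value as NaN is arguably more honest than reporting zero.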